Abstract

This work presents a set of new statistics, the cumulant correlators (CC), aimed at high precision
analysis of the galaxy distribution. They form a symmetric matrix, $Q_{NM}$, related to moment
correlators the same way as cumulants are related to the moments of the distribution. They encode
more information than the usual cumulants, the $S_N$'s, and their extraction from data is similar to
the calculation of the two-point correlation function. Perturbation theory (PT), its generalization,
the extended perturbation theory (EPT), and the hierarchical assumption (HA) have simple predictions
for these statistics. As an example, the factorial moment correlators measured by Szapudi, Dalton,
Efstathiou & Szalay (1996, hereafter SDES) in the APM catalog are reanalyzed using this technique.
While the previous analysis assumed hierarchical structure constants, this method can directly
investigate the validity of HA, along with PT and EPT. The results, in agreement with previous
findings, indicate that at the small scales used for this analysis the APM data support HA. When all
non-linear corrections are taken into account it is a good approximation at the 20 percent level. It
appears that PT, and a natural generalization of EPT for CCs, do not provide such a good fit for the
APM at small scales. Once the validity of the HA is approximately established, CCs can separate the
amplitudes of different tree-types in the hierarchy up to fifth order. As an example, the weights for
the fourth order tree topologies are calculated including all non-linear corrections.

Keywords: large scale structure of the universe - galaxies: statistics - methods: data analysis -
methods: statistical



1. Introduction 



Direct determination of higher order correlation functions (Fry & Peebles 1975, Peebles 1980,
and references therein) is burdened with the combinatorial explosion of terms, which severely
complicates their measurement and interpretation. Thus in recent years indirect methods have
become increasingly popular for high precision measurements of higher order correlations. The
simplest of these methods consists of calculating the (factorial) moments of the distribution
of counts in cells, and from these the cumulants, the $S_N$'s, of the underlying distribution (see e.g.
Peebles 1980, Gaztañaga 1992, Bouchet et al. 1993, Gaztañaga 1994, Colombi et al. 1995, Szapudi,
Meiksin, & Nichol 1996). For a point process, these quantities measure the amplitude of the
$N$-point correlation function averaged in a particular window. The advantages of this technique
lie in its simplicity and its direct relation to the predictions of PT (Peebles 1980, Juszkiewicz,
Bouchet, & Colombi 1993, Bernardeau 1992, Bernardeau 1994), EPT (Colombi et al. 1996) and
the HA (Peebles 1980). Since the averaging causes a significant loss of information, alternative
methods based on moment correlators use a pair of cells (Szapudi, Szalay & Boschan 1992, Meiksin,
Szapudi, & Szalay 1992, SDES). In the past such methods were used mainly to estimate the
average amplitude of the different $N$-point correlation functions in the HA, the $Q_N$'s, motivated
by the theory of the BBGKY equations in the strong clustering regime. This work presents an
alternative analysis of the factorial moment correlators which is free of assumptions, except for
the widely accepted infinitesimal Poisson model used to relate the continuum limit quantities to the
measured discrete process. Instead of fitting for the $Q_N$'s, a matrix $Q_{NM}$ is defined: the CCs.
Both HA and PT have specific predictions for these possibly scale dependent quantities. After
elaborating these predictions, the method is illustrated by reanalyzing the factorial moment
correlators obtained from the APM catalog by SDES. Once the HA is established, CCs contain
enough information to separate the weights of different tree topologies up to fifth order. The next
section outlines the basic theory, and section §3 presents the predictions of PT, EPT, and HA. The
measurements of the 4th order coefficients of the hierarchy from the APM catalog are described
in section §4.



2. Theory 



Following SDES we define the factorial moment correlators for a pair of cells separated by a
distance $r_{12}$ as

$$w_{kl}(r_{12}) = \frac{\langle (N_1)_k (N_2)_l \rangle - \langle (N)_k \rangle \langle (N)_l \rangle}{\langle N \rangle^{k+l}}, \qquad (1)$$

and the normalized factorial moments for a single cell

$$w_{k0} = \frac{\langle (N)_k \rangle}{\langle N \rangle^{k}}. \qquad (2)$$

The notation $(N)_k = N(N-1)\ldots(N-k+1)$ is introduced for the factorial moments of the counts in
cells, and $\langle\,\rangle$ denotes averaging over all cell positions in the survey. The connection with
the fluctuations of the underlying field, $\delta$, can be obtained by formally substituting
$(N)_k/\langle N \rangle^k \to (1+\delta)^k$. The
generating function for the factorial moments in terms of the cumulants $Q_N$ is

$$W(x) = \exp \sum_{N=1}^{\infty} \Gamma_N x^N Q_N, \qquad (3)$$

with

$$\Gamma_N = \frac{N^{N-2}\,\xi_s^{N-1}}{N!}, \qquad (4)$$

where $\xi_s = \sigma^2$ is the variance in a cell. The generating function can be written in the above
form for any distribution that has cumulants. Generally, the $Q_N$'s can have a scale dependence, while
for the HA $Q_N = \mathrm{const}$ is expected. Note the connection with the popular alternative notation,
$S_N = Q_N N^{N-2}$ exactly. Similarly, the generating function of the factorial moment correlators can
be written as

$$W(x, y) = W(x) W(y) \left( \exp Q(x, y) - 1 \right), \qquad (5)$$

with

$$Q(x, y) = \xi_l \sum_{M=1, N=1}^{\infty} x^M y^N Q_{NM} \Gamma_M \Gamma_N N M. \qquad (6)$$

This latter equation defines the CCs, $Q_{NM}$, with $\xi_l = w_{11}$, the two-point correlation function
between the cells. Typically in the APM survey, $\xi_l \ll \xi_s (= \sigma^2) < 1$. Note that the linear
dependence on $\xi_l$ is factored out; however, $Q_{NM}$ is not necessarily a constant.

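Cumulants and CCs are related to the continuum limit connected moments because of the
continuum properties of the factorial moments:

$$\frac{\langle \delta^N \rangle_c}{N!} = Q_N \Gamma_N, \qquad (7)$$

$$\frac{\langle \delta_1^N \delta_2^M \rangle_c}{N!\,M!} = Q_{NM} \Gamma_M \Gamma_N N M \xi_l. \qquad (8)$$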

Although the above equations are formally identical to those of SDES, there are two subtle
differences: there is no reference to the hierarchical assumption, therefore $Q_{NM}$ becomes a matrix,
and Equation 8 is understood as an exact equation, i.e. the non-linearities are included. It is
convenient to define CCs linear in $\xi_l$, denoted by $\tilde{Q}_{NM}$, which are obtained from the
generating function with the approximation $\exp Q(x, y) - 1 \simeq Q(x, y) + O(\xi_l^2)$. The
$\tilde{Q}_{NM}$'s coincide up to normalization with the $C_{NM}$'s calculated from PT by Bernardeau 1995
(see next section). Note that in the following linear and non-linear always refer to powers of $\xi_l$.

The CCs can be calculated for any well behaved point process by expanding
$W(x,y)/[W(x)W(y)]$ according to Equation 5. For instance the third and fourth order
moments are

$$Q_{12} \Gamma_1 \Gamma_2\, 2 \xi_l = w_{12}/2 - \xi_l, \qquad (9)$$

$$Q_{13} \Gamma_1 \Gamma_3\, 3 \xi_l = w_{13}/6 - w_{12}/2 - w_{20}\xi_l/2 + \xi_l, \qquad (10)$$

$$Q_{22} \Gamma_2^2\, 4 \xi_l = w_{22}/4 - w_{12} + \xi_l - \xi_l^2/2, \qquad (11)$$

and it follows that $Q_{22} = \tilde{Q}_{22} - \xi_l/(2\xi_s^2)$.
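As a rough illustration of how Equations 1, 2 and 9-11 might be applied to data, the following Python sketch computes factorial moment correlators from paired cell counts and converts them into the third and fourth order CCs. The function names, the input arrays `n1` and `n2`, and the use of NumPy are assumptions made for this example and are not taken from SDES; the comments restate the relations as read from the text, with $\Gamma_N = N^{N-2}\xi_s^{N-1}/N!$.

```python
from math import factorial

import numpy as np


def falling_factorial(n, k):
    """(N)_k = N (N - 1) ... (N - k + 1), applied elementwise; (N)_0 = 1."""
    n = np.asarray(n, dtype=float)
    out = np.ones_like(n)
    for j in range(k):
        out = out * (n - j)
    return out


def moment_correlator(n1, n2, k, l):
    """w_kl = (<(N1)_k (N2)_l> - <(N)_k><(N)_l>) / <N>^(k+l) for paired cells.

    n1[i] and n2[i] are the counts in the two cells of pair i (hypothetical
    input layout). Single-cell averages pool both cells of every pair.
    """
    n1 = np.asarray(n1, dtype=float)
    n2 = np.asarray(n2, dtype=float)
    pooled = np.concatenate([n1, n2])
    joint = np.mean(falling_factorial(n1, k) * falling_factorial(n2, l))
    single_k = falling_factorial(pooled, k).mean()
    single_l = falling_factorial(pooled, l).mean()
    return (joint - single_k * single_l) / pooled.mean() ** (k + l)


def gamma(n, xi_s):
    """Gamma_N = N^(N-2) xi_s^(N-1) / N!."""
    return float(n) ** (n - 2) * xi_s ** (n - 1) / factorial(n)


def cumulant_correlators(w12, w13, w22, w20, xi_l, xi_s):
    """Exact (non-linear) Q_12, Q_13, Q_22 from w_kl, xi_l = w_11 and xi_s."""
    g1, g2, g3 = (gamma(n, xi_s) for n in (1, 2, 3))
    q12 = (w12 / 2 - xi_l) / (g1 * g2 * 2 * xi_l)
    q13 = (w13 / 6 - w12 / 2 - w20 * xi_l / 2 + xi_l) / (g1 * g3 * 3 * xi_l)
    q22 = (w22 / 4 - w12 + xi_l - xi_l ** 2 / 2) / (g2 * g2 * 4 * xi_l)
    return q12, q13, q22
```

In this sketch `xi_l` would be estimated as `moment_correlator(n1, n2, 1, 1)` and `xi_s` as the single-cell value $w_{20} - 1$; dropping the $\xi_l^2/2$ term in the last line gives the linear version of $Q_{22}$.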



3. Predictions 

In the highly nonlinear regime, the HA (e.g., Peebles 1980; BS) states that the $N$-point
correlation functions can be written as a sum of products of $N-1$ two-point correlation functions.
Each product corresponds to a tree spanning the $N$ points, and there is a summation over all
possible trees. The different tree topologies, labeled with $k$, are weighted with a constant $Q_{Nk}$.
Our notation is described in detail in Boschan, Szapudi, & Szalay 1994 and Szapudi & Colombi 1996.
One of the goals of this paper is to validate the HA to an unprecedented accuracy.



Comparing Equation 6 with SDES and Szapudi & Szalay 1993 yields a linear order
prediction for the HA

$$Q_{N+M} \simeq Q_{NM} \simeq \mathrm{const}. \qquad (12)$$

For instance the 4th order cumulant $Q_4$ is approximately equal to the linear CCs $Q_{13}$, $Q_{22}$, and
constant, etc. While form factors from the smoothing were shown to be negligible by Boschan,
Szapudi, & Szalay 1994, different tree topologies and non-linear corrections will be taken into
account next for a more accurate prediction.

The only 3rd order CC is $Q_{12}$. Tree graphs spanning three points have only one possible
topology (its weight denoted by $Q_3$ with form factors neglected), giving altogether three possible
graphs:

$$\langle \delta_1 \delta_2^2 \rangle = 2 Q_{12} \xi_l \xi_s = Q_3 (2 \xi_l \xi_s + \xi_l^2), \qquad (13)$$

reproducing $Q_{12} = Q_3$ at linear order.
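The graph counting behind Equation 13 can be checked by brute force. The sketch below enumerates every labelled tree on a small set of points and tallies, for each tree, how many of its edges join points in different cells (each such edge contributes a factor $\xi_l$, every other edge a factor $\xi_s$). The function names and the `cell` labelling convention are assumptions made for this illustration only.

```python
from collections import Counter
from itertools import combinations


def spanning_trees(n):
    """Yield every labelled tree on n points as a tuple of n - 1 edges."""
    edges = list(combinations(range(n), 2))
    for subset in combinations(edges, n - 1):
        parent = list(range(n))

        def find(i):
            # Follow parent links to the representative of i's component.
            while parent[i] != i:
                i = parent[i]
            return i

        acyclic = True
        for a, b in subset:
            root_a, root_b = find(a), find(b)
            if root_a == root_b:  # this edge would close a loop
                acyclic = False
                break
            parent[root_a] = root_b
        if acyclic:  # n - 1 edges without a loop span all n points
            yield subset


def tree_tally(cell):
    """Count trees by (largest vertex degree, number of cell-crossing edges).

    cell[i] is 0 or 1 and names the cell that contains point i.
    """
    tally = Counter()
    for tree in spanning_trees(len(cell)):
        degree = Counter(v for edge in tree for v in edge)
        crossing = sum(cell[a] != cell[b] for a, b in tree)
        tally[(max(degree.values()), crossing)] += 1
    return tally
```

For one point in the first cell and two in the second, `tree_tally((0, 1, 1))` finds two trees with a single crossing edge and one tree with two crossing edges, which is the combination $2\xi_l\xi_s + \xi_l^2$ of Equation 13. The same function accepts four labels, for which it separates the chain-like trees (largest degree 2) from the star-like ones (largest degree 3).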

At fourth order there are two CCs, $Q_{13}$ and $Q_{22}$. The sixteen possible trees spanning four
points come in two distinct topologies: twelve "snake" graphs and four "star" graphs. Their
respective amplitudes are denoted by $R_a$ and $R_b$ in the HA. Summing all possible graphs with
the appropriate statistical weights gives

$$\langle \delta_1 \delta_2^3 \rangle_c = 9 Q_{13} \xi_l \xi_s^2 = 6 \xi_l \xi_s^2 R_a + 3 \xi_l \xi_s^2 R_b + 6 \xi_l^2 \xi_s R_a + \xi_l^3 R_b, \qquad (14)$$

and

$$\langle \delta_1^2 \delta_2^2 \rangle_c = 4 Q_{22} \xi_l \xi_s^2 = 4 \xi_l \xi_s^2 R_a + 4 \xi_l^2 \xi_s R_a + 4 \xi_l^2 \xi_s R_b + 4 \xi_l^3 R_a. \qquad (15)$$

These two equations are linear in $R_a$ and $R_b$, so they can be solved to express $R_a$ and $R_b$
in terms of $Q_{13}$ and $Q_{22}$, with coefficients that are non-linear in $\xi$. The linear solution is
$R_a = Q_{22}$ and $R_b = 3Q_{13} - 2Q_{22}$.
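The non-linear solution can be sketched numerically. Dividing Equations 14 and 15 by $\xi_l\xi_s^2$ and writing $r = \xi_l/\xi_s$ leaves a 2x2 linear system for $R_a$ and $R_b$; the Python function below, whose name and interface are assumptions made for this example, solves it with NumPy.

```python
import numpy as np


def hierarchy_amplitudes(q13, q22, xi_l, xi_s):
    """Solve the fourth order relations for the snake (R_a) and star (R_b) weights.

    After division by xi_l * xi_s**2, with r = xi_l / xi_s:
        9 Q_13 = (6 + 6 r) R_a + (3 + r^2) R_b
        4 Q_22 = (4 + 4 r + 4 r^2) R_a + 4 r R_b
    """
    r = xi_l / xi_s
    coeff = np.array([[6 + 6 * r, 3 + r ** 2],
                      [4 + 4 * r + 4 * r ** 2, 4 * r]])
    rhs = np.array([9 * q13, 4 * q22])
    r_a, r_b = np.linalg.solve(coeff, rhs)
    return r_a, r_b
```

Setting `xi_l = 0` recovers the linear solution $R_a = Q_{22}$, $R_b = 3Q_{13} - 2Q_{22}$. The system becomes singular at $\xi_l = \xi_s$, so the sketch is only meaningful for $\xi_l < \xi_s$, which is the regime stated for the APM cells.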



Direct comparison of Equation 8 with the coefficients $C_{NM}$ in Bernardeau 1995 reveals that
they are identical to the linear order CCs up to normalization

$$C_{NM} = Q_{NM} N^{N-1} M^{M-1} + O(\xi_l). \qquad (16)$$

Perturbation theory predicts that the coefficients factorize such that

$$C_{NM} = C_{N1} C_{M1}, \qquad (17)$$

and the series $C_{N1}$ was calculated up to first non-trivial order. The interested reader is referred
to Bernardeau 1995 for detailed predictions in the weakly non-linear regime; for the present work
only Equation 17 is needed.
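A simple way to test the factorization of Equation 17 on measured values is to form the ratio of each coefficient to the product of the corresponding first-row coefficients. With the normalization $C_{NM} = Q_{NM} N^{N-1} M^{M-1}$ the numerical factors cancel, so the ratio can be taken directly on the CCs. The helper below is a minimal sketch; the dictionary layout and the function name are assumptions for illustration.

```python
def factorization_ratio(q, n, m):
    """Return q_NM / (q_N1 * q_M1), which equals 1 when Equation 17 holds.

    q maps index pairs (N, M) to linear CCs and is treated as symmetric,
    so either ordering of a pair may be stored.
    """
    def get(a, b):
        return q[(a, b)] if (a, b) in q else q[(b, a)]

    return get(n, m) / (get(n, 1) * get(m, 1))
```

A ratio consistently above one would mean that the products of first-row coefficients underestimate the measured CCs.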



Although biasing is not investigated in this paper, it is worth noting that it can significantly
change the higher order correlations. In the weakly non-linear regime the results of Fry &
Gaztañaga 1993 should be generalized for CCs. Such a calculation, which is left for subsequent
research, will resolve the remaining ambiguities in the interpretation of CCs.



4. Measurements from the APM Catalog 

For an initial assessment, the linear CCs were first calculated from the factorial moment
correlators measured in the APM survey (Maddox et al. 1990a, Maddox et al. 1990b, Maddox
et al. 1990c) by SDES. In what follows, a density map of cell size 0.23° and magnitude cut of
$b_J = 17 - 20$ was used (see SDES for the detailed properties of the density maps). The bottom
panel of the Figure shows the measured $q_{NM}$'s (the linear projected CCs; lower case symbols
refer to projected quantities) up to fifth order. To interpret the figures note that the CCs are
characterized by two relevant scales: the angular separation, and the smoothing scale, or cell size.
On the figures, only the separation is shown in degrees ($1° \simeq 7 h^{-1}$ Mpc for this magnitude cut),
while the smoothing length (always 0.23°) remains implicit. The degeneracy and the approximately
parallel nature of the curves immediately suggest that the HA is a reasonable approximation. At
larger scales the CCs appear to roll off, while the prediction stays flat, and the degeneracy of the
curves is slightly broken. This is mainly due to the fact that linear CCs were used, and cumulants are
not exactly constant at all scales as shown by Gaztañaga 1994 and Szapudi, Meiksin, & Nichol 1996
(i.e. HA is slightly broken).



The middle panel of the Figure illustrates Equation 17, predicted by leading order PT. The
solid lines are the CCs $q_{NM}$, $N, M > 1$, while the dotted lines show the corresponding $q_{1N} q_{1M}$.
Only the fourth and fifth order are shown. The degree of validity of PT can be judged from how
well the dotted and solid lines match. Since the dotted lines appear to be consistently smaller
than the solid ones, this model provides a less accurate description of the data than HA. Possibly,
higher than leading order PT could improve the representation of the data; this is left for future
work.

It can be argued that PT for the CCs is valid when both relevant scales are in the weakly
non-linear regime. While PT matches the higher order correlations in the APM for larger scales
(Gaztañaga & Frieman 1994), for the small cell size used in this work non-linearities can be
important for the present measurement (Baugh & Gaztañaga 1994). However, it was found in
$N$-body simulations (Colombi et al. 1996) and galaxy data (Szapudi, Meiksin, & Nichol 1996)
that the higher order correlation amplitudes, $Q_N$, measured from counts in cells are similar to
those prescribed by PT, but with a steeper power spectrum. This phenomenological extension
of PT is the essence of EPT. The previous exercise taken at face value would suggest that EPT
cannot be generalized for moment correlators. A rough estimate of the errors based on Equation
12, with the variance scaled from Gaztañaga 1994 and Gaztañaga 1996 (private communication),
yields 5%, 7%, and 7% for the third, fourth, and fifth order respectively. These error bars, which
are not necessarily conservative, could only marginally exclude the natural extension of EPT at
small scales. Further measurements in $N$-body simulations and high quality data are needed to
show whether the EPT paradigm can be applied to CCs.

The HA can be examined with further scrutiny by relaxing the previous assumptions on
linearity and uniform weighting of topologies. The form factors resulting from the pair of cells
are expected to be smaller than the measurement errors and will still be neglected. Counting the
number of degrees of freedom reveals that from the cumulants and CCs it is possible to separate 
the different tree topologies up to fifth order. A calculation for the third and fourth order is 
presented here. The fifth order calculation is analogous, although somewhat tedious. At higher 
than fifth order additional information is needed to separate the different graph types. 

The long dashed line on the top panel of the Figure shows the non-linear measurement of $q_3$
as calculated from $q_{21}$ of the APM according to Eq. 13. The dotted lines show the linear solution
$r_a$ and $r_b$ as computed from $q_{22}$ and $q_{31}$. The hierarchy predicts two horizontal lines, with the
constraint that $16 q_4 = 12 r_a + 4 r_b$. The linear approximations on the other hand show a strong
scale dependence, increasing and even crossing over at the smallest scales: a possible sign of
non-linear effects. The full non-linear equations (14, 15) yield the result plotted with solid lines:
the non-linear corrections remove most of the scale dependence, as expected if HA is satisfied.
The residuals are probably due to the neglected form factors and measurement errors. On the left
side of the panel several amplitudes are plotted for comparison; for these points the angular scale
is irrelevant. The three sided symbols refer to third order quantities, the four sided to fourth
order. The filled triangle and square show the values of $q_3 = 1.15$ and $q_4 = 2.2$ calculated from
the averaged values of $q_{21} = 1.15$, and of $r_a = 1.15$ and $r_b = 5.3$, respectively. The open symbols
correspond to the values of $q_3 = 1.7$ and $q_4 = 4.17$ measured from the factorial moments alone,
$w_{k0}$, at the scale of the cells. For comparison, the two stars show the respective measurements of
SDES, $q_3 = 1.16$ and $q_4 = 1.96$. The reason that SDES measured a somewhat lower $q_4$ is that they
used linear approximations (dotted lines) only. The measurements of $q_3 = 1.7$ by Gaztañaga 1994
in the APM and $q_3 = 1.6$ by Szapudi, Meiksin, & Nichol 1996 at the same cell size are in excellent
agreement with the results from $w_{k0}$. The values for the fourth order in the same sources, $q_4 = 3.7$
and $q_4 = 3.2$, are slightly lower than above, but the agreement is still within 20-30 percent.
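The fourth order average quoted above follows from the tree-counting constraint $16 q_4 = 12 r_a + 4 r_b$. The short Python check below reproduces that arithmetic; the function name is an assumption for this illustration, and the inputs are the averaged amplitudes quoted in the text.

```python
def fourth_order_average(r_a, r_b):
    """Average over the 16 four-point trees: 12 snakes (r_a) and 4 stars (r_b)."""
    return (12 * r_a + 4 * r_b) / 16


# Averaged projected amplitudes quoted in the text: r_a = 1.15, r_b = 5.3.
q4 = fourth_order_average(1.15, 5.3)
```

The result is close to 2.19, consistent with the rounded value $q_4 = 2.2$.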

The above numbers suggest that, while the different measurements using the same method
are consistent with each other even in different catalogs, there is some disagreement between the
results based on moment correlators and those based on moments. The error distribution studied by
Szapudi & Colombi 1996 provides useful clues to resolve this apparent discrepancy. Since the
distribution of errors is positively skewed, and increasingly so for higher order moments, an upward
fluctuation is more likely than a downward one.
In the method proposed by this work $q_3$ is estimated from the value of $q_{21}$. The behavior of the
errors is similar to that of the product of a second and a first order quantity, thus the variance is
reduced. Note that this is possible only after the hierarchy is established, i.e. prior information is
used to reduce the scatter from cosmic errors. An accurate error estimation in this case would involve
a tedious calculation, a non-trivial generalization of Szapudi & Colombi 1996.

The de-projection using the coefficients in SDES yields $Q_3 = 1$, $R_a = 0.8$, and $R_b = 3.7$, giving
$Q_4 = 1.5$. This is to be compared with Fry & Peebles 1978, where the direct determination of
the four-point correlation function from the Lick catalog yielded $R_a = 2.5 \pm 0.6$ and $R_b = 4.3 \pm 1.2$.
These results could give a clue for solving the BBGKY equations in the highly non-linear regime.
The assumption of Hamilton 1988, that only the snake graphs have a contribution, appears to be
close to our results: although both graph types have a contribution, the average is closer to the
snake coefficient. The ansatz of Bernardeau & Schaeffer 1992, $\sqrt{R_a} = Q_3$, is not a particularly
good approximation. In conclusion the statistics of the CCs are in excellent agreement with HA.
The method outlined here in conjunction with future data and $N$-body simulations will be able to
pin down the amplitudes of the higher order correlations with unprecedented accuracy.
5. Figure Caption

Lower Panel. The linear CCs, $q_{NM}$, the main raw results of the paper, are displayed up to fifth
order as a function of the angular separation of cells in degrees. The parallel degenerate lines
suggest the HA.

Middle Panel. The linear CCs are shown on a linear scale (solid lines) together with the prediction
from PT (dotted lines). The agreement improves towards larger scales.

Upper Panel. The hierarchical amplitudes as calculated from the fully non-linear CCs are
displayed. The long dashed line corresponds to the estimator of $q_3$, the solid lines to the estimators
of $r_a$ and $r_b$, the amplitudes of the fourth order snake and star graphs, respectively. The dotted
lines show the linear approximation, which breaks down at smaller scales at this level of precision.
The filled symbols mark $q_3$ (triangle) and $q_4$ (square) as calculated from the moment correlators.
The open symbols are the same quantities as measured from the moments of counts in cells only. Finally,
the crosses show the measurements of $q_3$ (triangular) and $q_4$ (square) by SDES for comparison.